Learning Preferences with Kernel-Based Methods

Author: Tsivtsivadze Evgeni
Publication venue: Turku Centre for Computer Science
Publication date: 21/04/2009

Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines to order the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems, predicting individual customer choices from vast amounts of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which take advantage of various non-vectorial data representations, and preference learning algorithms suited to different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in the bioinformatics domain.
Training of kernel-based ranking algorithms can be infeasible when the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be trained efficiently with large amounts of data. For situations in which only a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
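
As a rough illustration of the idea of ranking with regularized least squares, the sketch below fits a linear scoring function by minimising the squared error on pairwise score differences plus a ridge penalty. It is not the thesis's algorithm: the linear model (no kernel), the naive enumeration of all pairs (quadratic, not linear, in the number of training points), and the names `fit_pairwise_rls`, `solve_linear_system`, `score`, and `rank` are all assumptions made for this example.

```python
from itertools import combinations

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_pairwise_rls(xs, ys, lam=1.0):
    """Minimise sum over pairs ((y_i - y_j) - w.(x_i - x_j))^2 + lam * |w|^2."""
    dim = len(xs[0])
    # Normal equations: (sum of d d^T + lam * I) w = sum of d * (y_i - y_j).
    gram = [[lam if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    rhs = [0.0] * dim
    for i, j in combinations(range(len(xs)), 2):
        diff = [xs[i][k] - xs[j][k] for k in range(dim)]
        target = ys[i] - ys[j]
        for r in range(dim):
            rhs[r] += diff[r] * target
            for c in range(dim):
                gram[r][c] += diff[r] * diff[c]
    return solve_linear_system(gram, rhs)

def score(w, x):
    """Linear score of one data point."""
    return sum(wk * xk for wk, xk in zip(w, x))

def rank(w, xs):
    """Indices of xs ordered from highest to lowest predicted score."""
    return sorted(range(len(xs)), key=lambda i: score(w, xs[i]), reverse=True)
```

Calling `rank(fit_pairwise_rls(xs, ys), xs)` returns the training points in predicted preference order; only the differences between scores matter, which is what distinguishes this objective from plain regression on `ys`.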

UTUPub

From ranking to intransitive preference learning: rock-paper-scissors and beyond

Author: De Baets Bernard
Pahikkala Tapio
Salakoski Tapio
Tsivtsivadze Evgeni
Waegeman Willem
Publication date: 01/01/2009

Ghent University Academic Bibliography

Premise Selection for Mathematics by Corpus Analysis and Kernel Methods

Author: A Grabowski
A Riazanov
A Tarski
A Trybulec
B Scholkopf
CM Bishop
Daniel Kühlwein
E Tsivtsivadze
Evgeni Tsivtsivadze
J Meng
J Shawe-Taylor
J Urban
Jesse Alama
Josef Urban
MD Richard
P Rudnicki
R Rifkin
S Schulz
S Shalev-Shwartz
Tom Heskes
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/04/2012

Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories. Comment: 26 pages.
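
To make the notion of learning premise relevance from a corpus of proof dependencies concrete, here is a minimal similarity-weighted voting sketch. It is a simple stand-in technique, not the kernel algorithm proposed in the paper: representing a statement by its set of symbols, the Jaccard similarity in `symbol_similarity`, the function `rank_premises`, and the toy theorem and premise names are all assumptions for illustration.

```python
def symbol_similarity(a, b):
    """Jaccard similarity between two sets of symbols (assumed feature model)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_premises(conjecture, proved, similarity=symbol_similarity):
    """Rank premises for a new conjecture.

    `proved` is a list of (symbols, premises_used) pairs, one per theorem
    whose proof dependencies are already known. Each premise collects the
    similarity between the conjecture and every theorem that used it.
    """
    scores = {}
    for symbols, premises in proved:
        sim = similarity(conjecture, symbols)
        for premise in premises:
            scores[premise] = scores.get(premise, 0.0) + sim
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))

# Hypothetical toy corpus of proved theorems and the premises they used.
proved = [
    ({"+", "0"}, {"add_zero", "add_comm"}),
    ({"*", "1"}, {"mul_one"}),
    ({"+", "*"}, {"distrib", "add_comm"}),
]
ranking = rank_premises({"+", "0", "succ"}, proved)
```

In a setting like the one the abstract describes, the top of such a ranking would be passed to the automated prover in place of the whole library; how many premises to keep is a separate tuning choice not covered here.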

arXiv.org e-Print Archive

Crossref

Radboud Repository

Molecular Machines in the Synapse: Overlapping Protein Sets Control Distinct Steps in Neurosecretion

Author: A Gulyas-Kovacs
A Maximov
A Vazquez
AG Fraser
AJ Groffen
AM Walter
Evgeni Tsivtsivadze
G Giaever
H de Wit
H Liu
I Augustin
I Dulubova
J Rizo
Jason M. Haugh
JB Sorensen
JJ Chua
JM Bekkers
K Ikeda
K Reim
KD Wierda
L. Niels Cornelisse
LF Abbott
M Boutros
M Geppert
M Xue
Marieke Meijer
Matthijs Verhage
NB Fredj
O Kochubey
R Fernandez-Chacon
R Schneggenburger
RD Emes
RF Toonen
RW Cho
S Huntwork
S Schoch
S Takamori
SM Young Jr
SS Lee
SS Mathew
TC Sudhof
Tjeerd M. H. Dijkstra
Tom Heskes
TW Groemer
WJ Jockusch
X Lou
X Wang
Y Ohya
Y Sara
Z Hua
Publication venue: Public Library of Science
Publication date: 01/01/2012

Activity-regulated neurotransmission shapes the computational properties of a neuron and involves the concerted action of many proteins. Classical, intuitive working models often assign specific proteins to specific steps in such complex cellular processes, whereas modern systems theories emphasize more integrated functions of proteins. To test how often synaptic proteins participate in multiple steps in neurotransmission, we present a novel probabilistic method to analyze complex functional data from genetic perturbation studies on neuronal secretion. Our method uses a mixture of probabilistic principal component analyzers to cluster genetic perturbations on two distinct steps in synaptic secretion, vesicle priming and fusion, and accounts for the poor standardization between different studies. Clustering data from 121 perturbations revealed that different perturbations of a given protein are often assigned to different steps in the release process. Furthermore, vesicle priming and fusion are inversely correlated for most of those perturbations in which a specific protein domain was mutated to create a gain-of-function variant. Finally, two different modes of vesicle release, spontaneous and action-potential-evoked release, were affected similarly by most perturbations. These data suggest that the presynaptic protein network has evolved as a highly integrated supramolecular machine, which is responsible for both spontaneous and activity-induced release, with a group of core proteins using different domains to act on multiple steps in the release process.

Directory of Open Access Journals

PubMed Central

Radboud Repository

FigShare

Graph kernels versus graph representations: a case study in parse ranking

Author: Evgeni Tsivtsivadze
Jorma Boberg
Tapio Pahikkala
Tapio Salakoski

Abstract. Recently, several kernel functions designed for data that consist of graphs have been presented. In this paper, we concentrate on designing graph representations and adapting the kernels for these graphs. In particular, we propose graph representations for dependency parses and analyse the applicability of several variations of the graph kernels to the problem of parse ranking in the domain of biomedical texts. The parses used in the study are generated with the link grammar (LG) parser from annotated sentences of the BioInfer corpus. The results indicate that designing the graph representation is as important as designing the kernel function that is used as the similarity measure of the graphs.
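
For readers unfamiliar with graph kernels, the following sketch counts the label-matching walks that two node-labelled directed graphs have in common, up to a maximum length. It is a generic walk-counting kernel chosen for illustration and is not claimed to be one of the kernel variations studied in the paper; the graph encoding (a label dictionary plus an edge list), the function name `walk_kernel`, and the toy parse graphs are assumptions.

```python
def walk_kernel(g1, g2, max_len=2):
    """Count common label-matching walks of length 0..max_len.

    Each graph is a pair (labels, edges): `labels` maps node id -> label,
    `edges` is a list of directed (source, target) pairs.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    # counts[(u, v)]: number of common walks of the current length that end
    # at node u in g1 and node v in g2. Length 0: any pair with equal labels.
    counts = {(u, v): 1
              for u in labels1 for v in labels2
              if labels1[u] == labels2[v]}
    total = sum(counts.values())
    for _ in range(max_len):
        extended = {}
        for a, b in edges1:
            for c, d in edges2:
                # Extend walks ending at (a, c) along a pair of edges whose
                # target nodes carry the same label.
                if (a, c) in counts and labels1[b] == labels2[d]:
                    extended[(b, d)] = extended.get((b, d), 0) + counts[(a, c)]
        counts = extended
        total += sum(counts.values())
    return total

# Hypothetical toy graphs: a verb governing two nouns, and a verb governing one.
parse_a = ({0: "N", 1: "V", 2: "N"}, [(1, 0), (1, 2)])
parse_b = ({0: "N", 1: "V"}, [(1, 0)])
similarity = walk_kernel(parse_a, parse_b)
```

Changing how a parse is turned into `labels` and `edges` (for example, adding link types as extra nodes) changes the kernel value without touching `walk_kernel`, which is one way to see why the choice of graph representation matters as much as the kernel itself.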

CiteSeerX